ABSTRACT

Under certain conditions the state space of a discrete parameter
Markov Chain may be partitioned to form a smaller "lumped" chain that
retains the Markov property. The problem of formulating lumpability
hypotheses when the transition probability matrix P is not known and,
hence, must be estimated is discussed. An approximate test of these
hypotheses is described, based on well-known non-parametric methods.
The procedure is illustrated by an example.

INTRODUCTION



Under certain conditions the state space of a discrete parameter
Markov Chain may be partitioned into subsets of states, each of which
may be treated as a single state of a smaller chain that retains the
Markov property. Such a chain is said to be "lumpable" or "weakly
lumpable", depending upon conditions, and elsewhere has been referred
to as a "mergable process" [3] and a "chain with collapsed states"
[2]. The resulting smaller chain is called a "lumped chain"; conditions
allowing lumping are discussed in detail by Kemeny and Snell [4]. A
practical problem arises in examining lumpability conditions when the
matrix of transition probabilities P is not known, but a number of
transitions have been observed. Billingsley [1] discusses related
problems of statistical inference for Markov Chains, but statistical
considerations of lumpability have apparently not been investigated.

In particular, we consider an aperiodic Markov Chain
$\{X_t : t = 0, 1, 2, \ldots\}$ with finite state space
$S = \{1, \ldots, n\}$ and stationary transition probability matrix
$P = [p_{ij}]$. It is convenient to restrict ourselves to the case
where $\{X_t\}$ is irreducible. Thus, the chain is described by $P$, a
vector of steady state probabilities $\pi = (\pi_1, \ldots, \pi_n)$,
and possibly an a priori distribution of initial states, $p^0$. Given
observations of $k$ transitions of this chain we obtain a matrix of
transition counts $[n_{ij}]$, where $n_{ij}$ is the number of
transitions into state $j$ from state $i$, and, of course,

$$\sum_{i=1}^{n} \sum_{j=1}^{n} n_{ij} = k.$$



Maximum likelihood estimators for the one-step transition probabilities
are given in [1]:

$$\hat{p}_{ij} = n_{ij} \Big/ \sum_{j} n_{ij} = n_{ij} / n_{i\cdot}, \qquad (1)$$

where $n_{i\cdot}$ is the observed frequency with which the process
visited state $i$. If a lumpability hypothesis (which we discuss below)
was formulated independently of the sample of $k$ transitions, it could
be tested in terms of the asymptotic $\chi^2$-theory involving
differences between observed and expected frequencies [1], [6]. More
often in practice, however, the hypothesis to be tested is suggested by
the sample of observed transitions. Technically, this gives rise to a
problem of the simultaneous inference type [5]; for "large" samples,
the magnitude of the error (usually lower than the desired size and
power) is insignificant. Thus, in approximate tests, based on asymptotic
distributions of the test statistics, the effect of formulating the
hypotheses to be tested with the aid of the data to be used for the
test is usually ignored. In what follows we shall discuss the use of
asymptotic $\chi^2$-theory in testing hypotheses of lumpability. We
begin under the assumption that the null hypothesis has been determined,
perhaps through reasoning about the physical system being modeled, or
perhaps through preliminary examination of data from $\{X_t\}$. Later,
in Section 4, we make some comments about the problem of hypothesis
formulation.
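The counting step and estimator (1) can be sketched in a few lines of Python. This is only an illustration: the function names `transition_counts` and `mle_transition_matrix`, the 0-based state labels, and the short observed record are hypothetical and do not come from the paper.

```python
def transition_counts(path, n_states):
    """Count one-step transitions; states are labelled 0..n_states-1 (an assumption)."""
    counts = [[0] * n_states for _ in range(n_states)]
    for i, j in zip(path, path[1:]):
        counts[i][j] += 1          # one more observed transition i -> j
    return counts


def mle_transition_matrix(counts):
    """Estimator (1): p_hat[i][j] = n_ij / n_i. ; rows never visited are left at zero."""
    p_hat = []
    for row in counts:
        total = sum(row)           # n_i. , the number of visits to state i
        p_hat.append([c / total if total else 0.0 for c in row])
    return p_hat


# Hypothetical record of k = 10 transitions (11 observations) on a 3-state chain.
path = [0, 1, 1, 2, 0, 0, 1, 2, 2, 0, 1]
counts = transition_counts(path, 3)
p_hat = mle_transition_matrix(counts)
```

The entries of `counts` sum to the number of observed transitions k, which gives a quick consistency check on the bookkeeping.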



1. LUMPABILITY HYPOTHESES

Consider an n-state Markov Chain $\{X_t : t = 0, 1, 2, \ldots\}$.

Formally, we have the following:



Definition: $\{X_t\}$ is lumpable with respect to a partition
$\tilde{S} = \{L_1, \ldots, L_m\}$ of $S$, where $m < n$, if for every
initial state probability vector $p^0$ the resulting chain
$\{\tilde{X}_t\}$ is Markovian and the transition probabilities
$\tilde{p}_{ij}$ are invariant under choices of $p^0$.



A necessary and sufficient condition for $\{X_t\}$ to be lumpable with
respect to a partition $\tilde{S}$ of $S$ is that for each pair
$(L_i, L_j)$, the probability of transition from $k$ to some
$l \in L_j$ is the same for each $k \in L_i$ (Theorem 6.3.2 [4]). We
shall use this characterization in stating hypotheses of lumpability.
The resulting lumped chain $\{\tilde{X}_t\}$ will be Markovian with
transition probabilities $\tilde{p}_{ij}$, where, for each $k \in L_i$,

$$\tilde{p}_{ij} = \sum_{l \in L_j} p_{kl}, \qquad i, j = 1, \ldots, m. \qquad (2)$$

The steady state probability vector
$\tilde{\pi} = (\tilde{\pi}_1, \ldots, \tilde{\pi}_m)$ of
$\{\tilde{X}_t\}$ has components

$$\tilde{\pi}_j = \sum_{l \in L_j} \pi_l, \qquad j = 1, \ldots, m,$$

and the corresponding prior $\tilde{p}^0$ is similarly determined from
$p^0$ by pooling over the states in the $L_j$'s.

To illustrate a lumpable Markov Chain, consider a 5-state chain
with transition probabilities $[p_{ij} : i, j = 1, \ldots, 5]$. Suppose
that this chain is lumpable into
$\tilde{S} = \{\{1\}, \{2, 4\}, \{3, 5\}\} = \{L_1, L_2, L_3\}$.
Then the transition probability matrix for $\{\tilde{X}_t\}$ is given by

$$\tilde{P} = \begin{pmatrix}
p_{11} & p_{12} + p_{14} & p_{13} + p_{15} \\
p_{21} & p_{22} + p_{24} & p_{23} + p_{25} \\
p_{31} & p_{32} + p_{34} & p_{33} + p_{35}
\end{pmatrix}.$$

It follows from (2) that

$$p_{21} = p_{41}, \quad p_{22} + p_{24} = p_{42} + p_{44}, \quad p_{23} + p_{25} = p_{43} + p_{45}, \qquad (3)$$

$$p_{31} = p_{51}, \quad p_{32} + p_{34} = p_{52} + p_{54}, \quad p_{33} + p_{35} = p_{53} + p_{55}. \qquad (4)$$



Here, $\{\tilde{X}_t\}$ will be Markovian for an arbitrary choice of
initial state probability vector. Burke and Rosenblatt [2] give weaker
lumpability conditions that apply whenever there exists at least one
choice of $p^0$ such that $\{\tilde{X}_t\}$ is Markovian. In either
case, in practice one makes conjectures (in the form of hypotheses)
about combining certain states, which result in forming postulated
probability transition matrices (of lumped chains), which in turn
satisfy conditions such as those in (3) characterizing lumpability into
these combined states.



2. TEST OF LUMPABILITY 

Let us denote the hypothesis that $\{X_t\}$ is lumpable into
$\tilde{S} = \{L_1, \ldots, L_m\}$ by the partition $\tilde{S}$ itself,
and suppose we take as the alternate the composite hypothesis that
$\{X_t\}$ is not lumpable into $\tilde{S}$:

$$H_0: \{L_1, \ldots, L_m\} \quad \text{vs.} \quad H_a: \text{not-}\{L_1, \ldots, L_m\}.$$

With the characterization of lumpability discussed above, $H_0$ is
equivalent to stating that, in addition to satisfying the conditions
of a stochastic matrix, $[p_{ij}]$ satisfies conditions (2).

The random variable

$$\sum_{i=1}^{n} \sum_{j=1}^{n} \frac{(n_{ij} - n_{i\cdot}\, p_{ij})^2}{n_{i\cdot}\, p_{ij}} \qquad (5)$$

is asymptotically (as $k \to \infty$) $\chi^2$-distributed with
$n(n - 1)$ degrees of freedom ([1], Theorem 5.3). However, the $p_{ij}$
are unknown, so we must apply the well-known procedure of reducing
degrees of freedom to account for estimation of parameters in (5). For
example, Roy [6] (Theorem 5, page 126) states the rule in a form
appropriate for the current context. We take the random variable (5),
with the $p_{ij}$'s replaced by the corresponding maximum likelihood
estimators $\hat{p}_{ij}$, as the test statistic. $H_0$ is rejected if
the calculated value of this statistic falls above the tabulated
$\chi^2$ quantile corresponding to the desired level of significance,
$\alpha$. Note that in calculating the $\hat{p}_{ij}$ we must use the
constraints, such as those given in (4) for our example, corresponding
to $H_0$. In addition, we must use the constraints to determine the
appropriate reduction in the degrees of freedom.

For a given null hypothesis $\{L_1, \ldots, L_m\}$, let $\lambda_i$
denote the number of the original $n$ states present in $L_i$. By
proper initial choice of labels for the states in $S$, it is possible
to state the null hypothesis in the form

$$H_0: \tilde{S} = \Big\{ \{1, 2, \ldots, \lambda_1\},\ \{\lambda_1 + 1, \ldots, \lambda_1 + \lambda_2\},\ \ldots,\ \Big\{ \sum_{i=1}^{m-1} \lambda_i + 1, \ldots, n \Big\} \Big\}.$$



We wish to estimate the $n^2$ parameters $p_{ij}$ subject to

$$\sum_{j=1}^{n} p_{ij} = 1; \qquad i = 1, 2, \ldots, n$$

($n$ constraints), and

$$\sum_{j \in L_s} p_{i'j} = \sum_{j \in L_s} p_{i''j}; \qquad i' \neq i'' \text{ both in } L_i; \quad i, s = 1, 2, \ldots, m, \qquad (6)$$

adding $\sum_{i=1}^{m} (\lambda_i - 1) \cdot m = m(n - m)$ constraints.
Thus, using Roy's rule-of-thumb, the number of "independent" parameters
we need to estimate is $n^2 - n - m(n - m)$, so the degrees of freedom
of the test statistic is simply
$n^2 - [n^2 - n - m(n - m)] = n + m(n - m)$.
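The parameter count above is simple arithmetic, and a short sketch can make the bookkeeping explicit. The function name `lumpability_test_df` and the choice to describe a partition by its block sizes $\lambda_1, \ldots, \lambda_m$ are assumptions made for illustration; the count itself follows the rule stated in the text.

```python
def lumpability_test_df(block_sizes):
    """Degrees of freedom n + m(n - m), following the parameter count in the text."""
    n = sum(block_sizes)          # number of original states
    m = len(block_sizes)          # number of lumped states
    # Constraints (6): (lambda_i - 1) equalities per block L_i, for each of m target blocks.
    lumping_constraints = sum(size - 1 for size in block_sizes) * m
    row_sum_constraints = n       # each row of P sums to one
    free_parameters = n * n - row_sum_constraints - lumping_constraints
    return n * n - free_parameters


# Block sizes of the partition {1}, {2, 4}, {3, 5} of five states.
df = lumpability_test_df([1, 2, 2])
```

For the five-state illustration with three lumped states this gives 5 + 3(5 - 3) = 11.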

The maximum likelihood estimators $\hat{p}_{ij}$ of the $p_{ij}$ under
the above constraints can be derived using Lagrangian multipliers with
the log likelihood function $\sum_{i,j} n_{ij} \log p_{ij}$. The form of
these estimators has the following intuitively appealing
interpretation. Suppose $k \in L_i$ and $q \in L_s$. In order to
estimate $p_{kq}$, first form a maximum likelihood estimate of
$\sum_{j \in L_s} p_{kj}$, where $L_s$ contains $q$. By equation (6), it
is not surprising that this estimate turns out to be a "pooled"
estimate,

$$\sum_{j \in L_s} \hat{p}_{kj} = \frac{\sum_{k' \in L_i} \sum_{j \in L_s} n_{k'j}}{\sum_{k' \in L_i} n_{k'\cdot}}, \qquad (7)$$

where, as before, $n_{k\cdot} = \sum_{j} n_{kj}$. The proper allocation
of the combined estimate (7) over the individual cells $p_{kj}$, for
each $j \in L_s$, is obtained by weighting (7) by the relative
frequencies $n_{kj} / \sum_{j' \in L_s} n_{kj'}$. The maximum likelihood
estimates of the $p_{kq}$ are thus given by

$$\hat{p}_{kq} = \frac{\sum_{k' \in L_i} \sum_{j \in L_s} n_{k'j}}{\sum_{k' \in L_i} n_{k'\cdot}} \cdot \frac{n_{kq}}{\sum_{j \in L_s} n_{kj}}. \qquad (8)$$

Replacing the unknown $p_{ij}$ in (5) by their estimates $\hat{p}_{ij}$
given above results in a test statistic which is distributed
approximately $\chi^2$ with $n + m(n - m)$ degrees of freedom.

In summary, the procedure for conducting a test of the hypothesis
$\tilde{S}$ of lumpability, at approximately the $\alpha$-level of
significance, is as follows:

1. Use the observed record $\{x_1, x_2, \ldots, x_{k+1}\}$ to form the
transition frequency matrix $(n_{ij})$.

2. Compute the estimates $\hat{p}_{ij}$ given in (8).

3. Calculate the value of the test statistic (5), with $\hat{p}_{ij}$
in place of the unknown $p_{ij}$.

4. Reject the hypothesis of lumpability if the calculated value of the
test statistic exceeds the tabulated $(1 - \alpha)$th quantile of the
$\chi^2$-distribution with $n + m(n - m)$ degrees of freedom.

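3. A NUMERICAL EXAMPLE

Consider a special case of our earlier example, where

$$P = \begin{pmatrix}
.3 & .1 & .2 & .1 & .3 \\
.1 & .3 & .1 & .3 & .2 \\
.5 & .1 & 0 & .1 & .3 \\
.1 & .5 & .2 & .1 & .1 \\
.5 & 0 & .1 & .2 & .2
\end{pmatrix}$$

with $\tilde{S} = \{\{1\}, \{2, 4\}, \{3, 5\}\}$, so

$$\tilde{P} = \begin{pmatrix}
.3 & .2 & .5 \\
.1 & .6 & .3 \\
.5 & .2 & .3
\end{pmatrix}.$$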
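That this P satisfies condition (2) for the stated partition, and yields the lumped matrix shown, can be checked mechanically. The sketch below is illustrative only: the function name `lump`, the 0-based state labels, and the numerical tolerance are assumptions, not part of the paper.

```python
def lump(P, partition, tol=1e-9):
    """Return the lumped matrix if P satisfies condition (2) for `partition`, else None."""
    lumped = []
    for block_i in partition:
        # For each state k in L_i, the probabilities of moving into each block L_j.
        rows = [[sum(P[k][l] for l in block_j) for block_j in partition]
                for k in block_i]
        first = rows[0]
        for row in rows[1:]:
            # Condition (2): these block probabilities must agree for every k in L_i.
            if any(abs(a - b) > tol for a, b in zip(first, row)):
                return None
        lumped.append(first)
    return lumped


P = [
    [0.3, 0.1, 0.2, 0.1, 0.3],
    [0.1, 0.3, 0.1, 0.3, 0.2],
    [0.5, 0.1, 0.0, 0.1, 0.3],
    [0.1, 0.5, 0.2, 0.1, 0.1],
    [0.5, 0.0, 0.1, 0.2, 0.2],
]
partition = [[0], [1, 3], [2, 4]]            # {1}, {2, 4}, {3, 5} with 0-based labels
P_lumped = lump(P, partition)                # lumped 3 x 3 matrix
not_lumpable = lump(P, [[0, 1], [2, 3, 4]])  # a partition that violates (2)
```

The second call returns `None` because states 3 and 5 move into the block {1, 2} with probabilities .6 and .5 respectively, so that alternative partition is not a lumping of this chain.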
We generated 1000 transitions with P, using a table of random numbers,
resulting in the frequency matrix

$$(n_{ij}) = \begin{pmatrix}
84 & 31 & 52 & 31 & 112 \\
22 & 46 & 13 & 54 & 33 \\
69 & 9 & 0 & 13 & 31 \\
14 & 83 & 33 & 23 & 16 \\
118 & 0 & 23 & 48 & 42
\end{pmatrix}.$$



Imagine P is unknown, and we wish to use the data in $(n_{ij})$ to test
$H_0: \tilde{S}$. The usual (without lumpability constraints) maximum
likelihood estimate of P is given by

$$\begin{pmatrix}
.27 & .10 & .17 & .10 & .36 \\
.13 & .27 & .08 & .32 & .20 \\
.57 & .07 & 0 & .11 & .25 \\
.08 & .49 & .20 & .14 & .09 \\
.51 & 0 & .10 & .21 & .18
\end{pmatrix}.$$



Under the hypothesized lumpability conditions, the matrix of estimates
$(\hat{p}_{ij})$ is

$$\begin{pmatrix}
.270 & .100 & .170 & .100 & .360 \\
.107 & .281 & .080 & .330 & .202 \\
.530 & .081 & 0 & .117 & .272 \\
.107 & .479 & .190 & .133 & .091 \\
.530 & 0 & .096 & .198 & .176
\end{pmatrix}.$$

The value of the test statistic is 3.03, which falls well below the
$\alpha = .05$ $\chi^2$ critical value 19.68 with 5 + 3(5 - 3) = 11
degrees of freedom. We would thus conclude the observed data is
consistent with the hypothesis of lumpability, in the sense that the
test value is not significant at the .05 level. Of course, in this case
with P known, $H_0$ is known to be true; the "test" is simply an
illustration of how we would have proceeded if P had not been known.
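The four-step procedure of Section 2 applied to the counts above can be sketched as follows. The function names, the 0-based state labels, and the handling of empty cells (cells with zero expected count are skipped, and a state with no observed transitions into a block receives zero estimates there) are assumptions made for this illustration; the critical value 19.68 is copied from the text rather than computed.

```python
def constrained_mle(counts, partition):
    """Estimates (8): pooled block estimate (7) times the within-block relative frequency."""
    n = len(counts)
    p_hat = [[0.0] * n for _ in range(n)]
    for block_i in partition:
        visits = sum(sum(counts[k]) for k in block_i)   # sum of n_k. over k in L_i
        for block_s in partition:
            pooled = sum(counts[k][j] for k in block_i for j in block_s) / visits
            for k in block_i:
                row_block = sum(counts[k][j] for j in block_s)
                for q in block_s:
                    if row_block > 0:                   # assumption: leave 0/0 cells at zero
                        p_hat[k][q] = pooled * counts[k][q] / row_block
    return p_hat


def chi_square_statistic(counts, p_hat):
    """Statistic (5) with estimates in place of the unknown p_ij."""
    stat = 0.0
    for i, row in enumerate(counts):
        n_i = sum(row)
        for j, n_ij in enumerate(row):
            expected = n_i * p_hat[i][j]
            if expected > 0:                            # assumption: skip empty cells
                stat += (n_ij - expected) ** 2 / expected
    return stat


counts = [
    [84, 31, 52, 31, 112],
    [22, 46, 13, 54, 33],
    [69, 9, 0, 13, 31],
    [14, 83, 33, 23, 16],
    [118, 0, 23, 48, 42],
]
partition = [[0], [1, 3], [2, 4]]    # {1}, {2, 4}, {3, 5} with 0-based labels
p_hat = constrained_mle(counts, partition)
statistic = chi_square_statistic(counts, p_hat)
critical_value = 19.68               # chi-square, 11 d.f., alpha = .05, as quoted in the text
reject = statistic > critical_value
```

Run on these counts, the statistic comes out close to the reported 3.03; small differences are to be expected because the published estimates are rounded to three decimals.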

4. COMMENTS

We have discussed a test of a given lumpability hypothesis; the
problem of using the observed data both to formulate the hypothesis and
to test it has been mentioned. Even if one disregards this problem,
there is a very significant problem in how to use the data to formulate
appropriate hypotheses. A solution of this problem would be of great
interest, for example, in large computer-based information systems,
where man's intuition is not sufficient to cope with the range of
possible alternatives.